DETAILED ACTION
Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(a) the invention was known or used by others in this country, or patented or described in a printed 
publication in this or a foreign country, before the invention thereof by the applicant for a patent. 

2. Claims 6, 23, 24 and 25 are rejected under 35 U.S.C. 102(a) as being anticipated 
by Shipp (US 2004/0093384 A1).

3. Regarding claim 6, Shipp shows a method comprising the steps of receiving an
email message comprising a text body ([0064,0065]), an SMTP email address
([0039,0043,0069]), and a domain name corresponding to the SMTP email address
([0039,0045,0046]); 

tokenizing the SMTP email address to generate a token representative of the 
SMTP email address ([0039,0043,0063]) 

tokenizing the domain name to generate a token representative of the domain 
name ([0022]), and 

determining a spam probability from the generated tokens ([0014,0076]). 
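
For illustration only, the two tokenizing steps recited above might be sketched as follows. This is a minimal sketch, not the method of Shipp or of the application: the function name `tokenize_sender`, the `addr:` and `domain:` prefixes, and the lowercase normalization are all assumptions introduced here.

```python
def tokenize_sender(address: str) -> tuple[str, str]:
    """Return (address_token, domain_token) for an address such as 'user@example.com'.

    Hypothetical helper: the token prefixes and the normalization are
    illustrative assumptions, not taken from the cited references.
    """
    # Drop surrounding whitespace and angle brackets, then fold case.
    normalized = address.strip().strip("<>").lower()
    # Split on the last '@' so the domain is everything after it.
    local, sep, domain = normalized.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid address: {address!r}")
    return f"addr:{normalized}", f"domain:{domain}"
```

Called on `" <Alice@Example.COM> "`, the sketch yields one token for the full address and a second token for its domain, which could then be looked up in whatever probability table the spam filter maintains.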

4. Regarding claim 23, Shipp further shows a system comprising a text body 
([0064,0065]), an SMTP email address ([0039,0043,0069]), and a domain name
corresponding to the SMTP email address ([0039,0045,0046]); 

tokenizing the SMTP email address to generate a token representative of the 
SMTP email address ([0039,0043,0063]) 

tokenizing the domain name to generate a token representative of the domain
name ([0022]), and 

determining a spam probability from the generated tokens ([0014,0076]). 

5. Regarding claim 24, Shipp further shows a system comprising means for 
receiving an email message comprising an SMTP email address ([0039,0043,0069]),
and a domain name corresponding to the SMTP email address ([0039,0045,0046]); 

means for tokenizing the SMTP email address to generate a token representative 
of the SMTP email address ([0039,0043,0063]) 

means for tokenizing the domain name to generate a token representative of the
domain name ([0022]), and

means for determining a spam probability from the generated tokens
([0014,0076]). 

6. Regarding claim 25, Shipp further shows a computer-readable medium 
comprising a text body ([0064,0065]), an SMTP email address ([0039,0043,0069]), and
a domain name corresponding to the SMTP email address ([0039,0045,0046]); 

tokenizing the SMTP email address to generate a token representative of the 
SMTP email address ([0039,0043,0063]) 

tokenizing the domain name to generate a token representative of the domain 
name ([0022]), and 

determining a spam probability from the generated tokens ([0014,0076]).

Claim Rejections - 35 USC § 103

7. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

8. Claim 1 is rejected under 35 U.S.C. 103(a) as being unpatentable over Shipp 
(US 2004/0093384 A1) in view of Devine et al. (US 6,968,571 B2), hereafter Devine, 
further in view of Milliken et al. (US 2004/0073617 A1), hereafter Milliken, further in view 
of Anderson et al. (US 2004/0064537 A1), hereafter Anderson, further in view of 
Uuencode and MIME FAQ 

(http://web.archive.Org/web/20021217052047/http://users.rcn.com/wussery/attach.html), 
further in view of Gordon et al. (US 6,732,157 B1), hereafter Gordon, further in view of 
Sahami et al. (A Bayesian Approach to Filtering Junk E-Mail), hereafter Sahami. 

9. Regarding claim 1, Shipp shows a method comprising the steps of: 
receiving an email message from a simple mail transfer protocol (SMTP) server, 

the email message comprising ([0018,0023]) 
a text body ([0064,0065]) 

an SMTP email address ([0018,0023,0039,0045,0046]) 

a domain name corresponding to the SMTP email address ([0039,0045,0046]) 

an attachment ([0081]) 

tokenizing the text body to generate tokens representative of words in the text
([0064-0067]) 

tokenizing the SMTP email address to generate a token representative of the 
SMTP email address ([0039,0043,0069]) 

tokenizing the domain name to generate a token that is representative of the domain
name ([0022])

as well as showing MD5 hashing ([0093]).

Shipp does not show a 32-bit string indicative of the length of the email message, 
nor does Shipp show tokenizing the attachment and the steps comprising tokenizing 
said attachment, determining a probability value for each generated token, selecting a 
predefined number of interesting tokens, the interesting tokens being the generated 
tokens having the greatest non-neutral probability value; performing a Bayesian 
analysis on the selected interesting tokens to generate a spam probability; and 
categorizing the email message as a function of the generated spam probability. 

Devine shows utilizing a 32-bit string in a message header which is indicative of 
the total length of said message (col. 24 lines 52-67). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp with that of Devine in order to better identify 
message contents so as to facilitate leveraging common code for processing messages 
(Devine col. 23 lines 60-61). 

Shipp in view of Devine do not show tokenizing the attachment. 

Milliken shows tokenizing the attachment to generate a token that is
representative of the attachment, the tokenizing steps comprising the steps of
generating a MD5 hash of the attachment ([0010-0013 and 0052], where MD5 hashes
are inherently 128-bit).

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp in view of Devine with that of Milliken in order 
to better identify spam email, as Shipp fails to consider spam email identification relating 
to attachments, an area which Milliken's disclosure provides guidance. 

Shipp in view of Devine and Milliken do not show appending the 32-bit string to 
the generated MD5 hash to produce a 160-bit number. 

Anderson shows ([0057-0059]) appending an MD5 hash (inherently 128-bits) to 
network transmission size information (shown by Devine to be said 32-bit string, and 
where 32 +128 is inherently 160). 
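
The bit arithmetic relied on here (a 128-bit MD5 digest followed by a 32-bit length field gives 160 bits) can be checked with a short sketch. The function name `attachment_fingerprint`, the big-endian byte order, and the masking of the length to 32 bits are assumptions made for illustration; none of the cited references is asserted to specify them.

```python
import hashlib
import struct


def attachment_fingerprint(attachment: bytes, message_length: int) -> bytes:
    """Append a 32-bit message length to the MD5 digest of an attachment.

    Hypothetical helper: byte order and overflow handling are assumptions.
    """
    # MD5 always produces a 16-byte (128-bit) digest.
    digest = hashlib.md5(attachment).digest()
    # Pack the length as an unsigned 32-bit big-endian integer (4 bytes).
    length_field = struct.pack(">I", message_length & 0xFFFFFFFF)
    # 16 bytes + 4 bytes = 20 bytes = 160 bits.
    return digest + length_field
```

Whatever the attachment size, the result is always 20 bytes long, which is the 160-bit number referred to above.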

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp in view of Devine and Milliken with that of 
Anderson in order to better uniquely identify messages (Anderson [0057-0059]), leading 
to improved message spam identification. 

Shipp in view of Devine, Milliken and Anderson do not show UUencoding said 
160-bit number to generate a token representative of the attachment. 

Uuencode and MIME FAQ shows UUencoding a file. 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp in view of Devine, Milliken and Anderson 
with that of Uuencode and MIME FAQ in order to store the message identification
information (represented by the 160-bit number shown by Shipp in view of Devine, 
Milliken and Anderson) in a format easily exchanged over email (UUencode and MIME 
FAQ) since UUencoding produces an easily emailed file and since the disclosure of 
Shipp in view of Devine, Milliken and Anderson relates to email and files transferred 
over email. Furthermore, UUencoding is a prior art element, as shown in UUencode and 
MIME FAQ, and thus UUencoding the 160-bit number is combining a prior art element
(UUencoding) to known methods (the known methods shown by Shipp in view of 
Devine, Milliken and Anderson) to yield predictable results (the results being a 
UUencoded item). 
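
As a rough illustration of the "predictable result" described above, the sketch below UUencodes a 20-byte (160-bit) value into a single printable ASCII line using Python's standard `binascii.b2a_uu`. The function name `uuencode_token` and the choice to drop the trailing newline are assumptions; the sketch does not claim to reproduce the encoding used in any cited reference.

```python
import binascii


def uuencode_token(fingerprint: bytes) -> str:
    """UUencode a 160-bit value into one printable ASCII line.

    Hypothetical helper: stripping the trailing newline is an assumption.
    """
    if len(fingerprint) != 20:
        raise ValueError("expected a 160-bit (20-byte) value")
    # b2a_uu encodes up to 45 bytes as one line: a length character
    # followed by 4 output characters per 3 input bytes, then a newline.
    line = binascii.b2a_uu(fingerprint)
    return line.decode("ascii").rstrip("\n")
```

The encoded line can be turned back into the original 20 bytes with `binascii.a2b_uu`, so the token remains a reversible, text-safe representation of the 160-bit number.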

Shipp in view of Devine, Milliken, Anderson and UUencode and MIME FAQ do 
not show determining a probability value for each of the generated tokens. 

Gordon shows determining a probability value for each of the generated tokens 
(col. 11 lines 15-55). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp in view of Devine, Milliken, Anderson and 
Uuencode and MIME FAQ with that of Gordon in order to better identify spam elements 
in messages (Gordon col. 11 lines 15-55).

Shipp in view of Devine, Milliken, Anderson, UUencode and MIME FAQ and 
Gordon do not show selecting a predefined number of interesting tokens, the interesting 
tokens being the generated tokens having the greatest non-neutral probability value; 
performing a Bayesian analysis on the selected interesting tokens to generate a spam 
probability; and categorizing the email message as a function of the generated spam
probability. 

Sahami shows selecting a predefined number of interesting tokens, the 
interesting tokens being the generated tokens having the greatest non-neutral 
probability value; performing a Bayesian analysis on the selected interesting tokens to 
generate a spam probability; and categorizing the email message as a function of the 
generated spam probability (pg. 2, col. 2; pg. 4, col. 2; pg. 6, col. 1). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp in view of Devine, Milliken, Anderson, 
Uuencode and MIME FAQ and Gordon with that of Sahami in order to more accurately 
identify spam email. 
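
The three steps at issue (selecting the tokens whose probabilities are farthest from neutral, combining them, and categorizing the message) can be sketched as below. This is a minimal sketch of a commonly used naive Bayes combination and is not asserted to be the exact computation in Sahami or in the application. The function names, the neutral value of 0.5, the default of 15 tokens, the clamping bounds, and the 0.9 threshold are all assumptions.

```python
from math import prod


def interesting_tokens(probs: dict[str, float], count: int = 15) -> list[str]:
    """Return up to `count` tokens whose probability is farthest from 0.5."""
    ranked = sorted(probs, key=lambda t: abs(probs[t] - 0.5), reverse=True)
    return ranked[:count]


def combined_spam_probability(values: list[float]) -> float:
    """Combine per-token spam probabilities, assuming token independence."""
    # Clamp to avoid a zero denominator when a token has probability 0 or 1.
    clamped = [min(max(p, 0.01), 0.99) for p in values]
    spam = prod(clamped)
    ham = prod(1.0 - p for p in clamped)
    return spam / (spam + ham)


def categorize(probability: float, threshold: float = 0.9) -> str:
    """Label a message spam only if its probability exceeds the threshold."""
    return "spam" if probability > threshold else "non-spam"
```

For example, with per-token values `{"free": 0.9, "meeting": 0.2, "the": 0.5, "viagra": 0.99}` and `count=2`, the sketch keeps `viagra` and `free`, and their combined probability exceeds the assumed 0.9 threshold, so the message would be labeled spam.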

Shipp in view of Devine, Milliken, Anderson, Uuencode and MIME FAQ,
Gordon and Sahami thus show claim 1.

10. Claims 11, 12, 13, 14, 26, 27, 28 and 29 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Shipp as applied to claim 6 above, and further in view of 
Gordon and Sahami. 

11. Regarding claims 11 and 26, Shipp shows assigning a spam probability value,
including considering the token representative of the SMTP email address 
([0018,0023,0039,0040-0043]) and the token representative of the domain name 
([0022]). 

Shipp does not show where spam probability values are assigned to individual 
tokens, but rather shows utilizing an aggregate value based on all tokens. 

Gordon shows assigning probability values to individual tokens (col. 11 lines 15-55).

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp with that of Gordon in order to better identify 
spam elements in messages (Gordon col. 11 lines 15-55).

Shipp in view of Gordon do not show generating a Bayesian probability value
using the spam probability values assigned to the tokens. 

Sahami shows generating a Bayesian probability value using the spam
probability values assigned to the tokens (pg.2, col. 2; pg. 4, col. 2; pg. 6, col. 1). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp in view of Gordon with that of Sahami in 
order to more accurately identify spam email. 

12. Regarding claims 12 and 27 Shipp in view of Gordon and Sahami further show 
comparing the generated Bayesian probability value with a predefined threshold value 
(Sahami, pg.2, col. 2; pg. 4, col. 2; pg. 6, col. 1). 

13. Regarding claims 13 and 28 Shipp in view of Gordon and Sahami further show 
categorizing the email message as spam in response to the Bayesian probability value 
being greater than the predefined threshold (Sahami, pg.2, col. 2; pg. 4, col. 2; pg. 6, 
col. 1). 

14. Regarding claims 14 and 29 Shipp in view of Gordon and Sahami further show 
categorizing the email message as non-spam in response to the Bayesian probability 
value being not greater than the predefined threshold (Sahami pg. 6 col. 1). 

15. Claims 15, 16, 17, 30, 31, 32, 33 and 34 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Shipp as applied to claim 1 above, and further in view of 
Milliken. 

16. Regarding claims 15, 30, 31 and 32, Shipp shows a system, a computer-readable
medium and method comprising the steps of receiving an email message comprising an 
attachment and tokenizing at least parts of said message (Shipp [0018,0023]) as well as 
determining a spam probability from the tokens (Shipp [0014,0076]). 

Shipp does not show tokenizing the attachment to generate a token 
representative of the attachment. 

Milliken shows tokenizing the attachment to generate a token representative of 
the attachment ([0010-0013]). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp with that of Milliken in order to better identify 
spam email, as Shipp fails to consider spam email identification relating to attachments, 
an area which Milliken's disclosure provides guidance. 

17. Regarding claims 16 and 33, Shipp in view of Milliken further shows receiving an 
email message including a text body (Shipp [0064,0065]). 

18. Regarding claims 17 and 34, Shipp in view of Milliken further show tokenizing the 
words in the text body to generate tokens representative of the words in the text body 
(Shipp [0064,0065]). 

19. Claims 19, 20, 21, 22, 35, 36, 37 and 38 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Shipp in view of Milliken as applied to claims 15-17 above, and 
further in view of Gordon and Sahami. 

20. Regarding claims 19 and 35, Shipp in view of Milliken show claims 17 and 34.

Shipp in view of Milliken do not show assigning a spam probability value to each
of the tokens representative of the words in the text body, nor do they show assigning a
spam probability value specifically to the attachment, but rather determining a single
spam probability value utilizing the tokens.

Gordon shows (Gordon, col. 11 lines 15-55) assigning a spam probability
value to each of the tokens representative of the words in the text body,
and to the token representative of the attachment (Gordon, col. 11 lines 15-55, Milliken 
[0010-0013]). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp in view of Milliken with that of Gordon in 
order to better identify spam elements in messages (Gordon col. 11 lines 15-55).

Shipp in view of Milliken and Gordon do not show generating a Bayesian 
probability value using the spam probability values assigned to the token. 

Sahami shows generating a Bayesian probability value using the spam
probability values assigned to the token (Sahami, pg. 4 col. 2). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify the disclosure of Shipp in view of Milliken and Gordon with that of 
Sahami in order to more accurately identify spam email. 

21. Regarding claims 20 and 36, Shipp in view of Milliken, Gordon and Sahami
further show comparing the generated Bayesian probability value with a predefined 
threshold value (Sahami, pg. 4 col. 2). 

22. Regarding claims 21 and 37, Shipp in view of Milliken, Gordon and Sahami 
further show categorizing the email message as spam in response to the Bayesian 
probability value being greater than the predefined threshold (Sahami, pg.2, col. 2; pg. 
4, col. 2; pg. 6, col. 1). 

23. Regarding claims 22 and 38, Shipp in view of Milliken, Gordon and Sahami 
further show categorizing the email message as non-spam in response to the Bayesian 
probability value being not greater than the predefined threshold (Sahami, pg. 6 col. 1). 

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to John M. Frink whose telephone number is (571) 272- 
9686. The examiner can normally be reached on M-F 7:30AM - 5:00PM EST; off 
alternate Fridays. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Andrew Caldwell can be reached on (571)272-3868. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



John Frink 